A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition
Authors
M.L. Seltzer, B. Raj, R.M. Stern
Abstract
Missing feature methods of noise compensation for speech recognition operate by first identifying the components of a spectrographic representation of speech that are considered to be corrupt. Recognition is then performed either using only the remaining reliable components, or after the corrupt components have been reconstructed. These methods require a spectrographic mask that accurately labels the reliable and corrupt regions of the spectrogram. Depending on the missing feature method applied, the mask must contain either binary or probabilistic values. Current mask estimation techniques rely on explicit estimation of the characteristics of the corrupting noise, and the estimation process usually assumes that the noise is pseudo-stationary or varies slowly with time. This is a significant drawback, since the missing feature methods themselves have no such restrictions. We present a new mask estimation technique that uses a Bayesian classifier to determine the reliability of spectrographic elements. The classification features were designed to make no assumptions about the corrupting noise signal and instead exploit characteristics of the speech signal itself. Experiments were performed on speech corrupted by a variety of noises, using missing feature compensation methods that require binary masks and methods that require probabilistic masks. In all cases, the proposed Bayesian mask estimation method resulted in significantly better recognition accuracy than conventional mask estimation approaches.
© 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.specom.2004.03.006
Corresponding author address: Microsoft Research, 1 Microsoft Way, Redmond, WA 98052, USA. Tel.: +1 425 706 3763; fax: +1 425 936 7329.
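As a rough illustration of the idea in the abstract, the sketch below classifies spectrogram cells as reliable or corrupt with a two-class Bayesian classifier and returns both a probabilistic and a binary mask. It is not the authors' implementation: the abstract does not specify the features or the class models, so the diagonal-covariance Gaussians, the equal priors, the 0.5 threshold, the synthetic two-dimensional features, and all function names here are assumptions made for the example.

```python
import numpy as np

def fit_gaussian(features):
    """Estimate the mean and per-dimension variance of one class (assumed model)."""
    mean = features.mean(axis=0)
    var = features.var(axis=0) + 1e-6  # small floor avoids division by zero
    return mean, var

def log_likelihood(features, mean, var):
    """Diagonal-covariance Gaussian log-density for each row of `features`."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (features - mean) ** 2 / var, axis=1)

def estimate_mask(features, reliable_model, corrupt_model, prior_reliable=0.5):
    """Return (probabilistic mask, binary mask), one value per spectrogram cell."""
    log_rel = log_likelihood(features, *reliable_model) + np.log(prior_reliable)
    log_cor = log_likelihood(features, *corrupt_model) + np.log(1.0 - prior_reliable)
    # Bayes' rule in the log domain: P(reliable | features)
    soft = np.exp(log_rel - np.logaddexp(log_rel, log_cor))
    hard = soft >= 0.5  # hypothetical threshold for the binary mask
    return soft, hard

# Synthetic stand-in for labelled training cells (hypothetical 2-D features).
rng = np.random.default_rng(0)
reliable_model = fit_gaussian(rng.normal(loc=2.0, scale=1.0, size=(500, 2)))
corrupt_model = fit_gaussian(rng.normal(loc=-2.0, scale=1.0, size=(500, 2)))

cells = np.array([[2.0, 2.0], [-2.0, -2.0]])
soft_mask, binary_mask = estimate_mask(cells, reliable_model, corrupt_model)
print(soft_mask, binary_mask)
```

The probabilistic mask would feed missing feature methods that accept soft reliability values, while the thresholded mask would feed methods that need a hard reliable/corrupt decision.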
Similar resources
Classifier-based mask estimation for missing feature methods of robust speech recognition
Missing feature methods of noise compensation for speech recognition operate by removing components of a spectrographic representation of speech that are considered to be corrupt, as indicated by a low signal-to-noise ratio. Recognition is either performed directly on the incomplete spectrograms or the missing components are reconstructed prior to recognition. These methods require a spectrogra...
Neural Network Based Missing Feature Method For Text-Independent Speaker Identification
The first step of missing feature methods in text-independent speaker identification is to identify highly corrupted spectrographic representation of speech as missing feature. Most mask estimation techniques rely on explicit estimation of the characteristics of the corrupting noise and usually fail to work with inaccurate estimation of noise. We present a mask estimation technique that uses ne...
On the Relation between Statistical Properties of Spectrographic Masks and Recognition Accuracy
Missing Data Techniques (MDT) can significantly improve the accuracy of automatic speech recognition (ASR) for speech corrupted by background noise. The increase in recognition accuracy obtained using MDT is largely dependent on the estimation of spectrographic masks used to distinguish speech from noise. We present an analysis technique which enables us to compare two mask estimation technique...
Reconstruction of missing features for robust speech recognition
Speech recognition systems perform poorly in the presence of corrupting noise. Missing feature methods attempt to compensate for the noise by removing noise corrupted components of spectrographic representations of noisy speech and performing recognition with the remaining reliable components. Conventional classifier-compensation methods modify the recognition system to work with the incomplete...
Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise
“Missing-feature” techniques to improve speech recognition accuracy are based on the blind determination of which cells in a spectrogram-like display of speech are corrupted by the effects of noise or other types of disturbance (and hence are “missing”). In this paper we present three new approaches that improve the speech recognition accuracy obtained using missing-feature techniques. It had b...
Journal title:
- Speech Communication
Volume 43
Publication date 2004